semcom system
Generative Feature Imputing -- A Technique for Error-resilient Semantic Communication
Huang, Jianhao, Zeng, Qunsong, Du, Hongyang, Huang, Kaibin
Semantic communication (SemCom) has emerged as a promising paradigm for achieving unprecedented communication efficiency in sixth-generation (6G) networks by leveraging artificial intelligence (AI) to extract and transmit the underlying meanings of source data. However, deploying SemCom over digital systems presents new challenges, particularly in ensuring robustness against transmission errors that may distort semantically critical content. To address this issue, this paper proposes a novel framework, termed generative feature imputing, which comprises three key techniques. First, we introduce a spatial-error-concentration packetization strategy that spatially concentrates feature distortions by encoding feature elements based on their channel mappings--a property crucial for both the effectiveness and reduced complexity of the subsequent techniques. Second, building on this strategy, we propose a generative feature imputing method that utilizes a diffusion model to efficiently reconstruct missing features caused by packet losses. Finally, we develop a semantic-aware power allocation scheme that enables unequal error protection by allocating transmission power according to the semantic importance of each packet. Experimental results demonstrate that the proposed framework outperforms conventional approaches, such as Deep Joint Source-Channel Coding (DJSCC) and JPEG2000, under block fading conditions, achieving higher semantic accuracy and lower Learned Perceptual Image Patch Similarity (LPIPS) scores. The sixth-generation (6G) wireless networks promise to support a broad range of emerging applications, such as immersive internet-of-things (IoT), multimedia streaming, and augmented reality, which necessitate ultra-high rates and reliability, and low latency [1]-[5]. However, as dictated by Shannon's information theory, these objectives are in conflict with each other given limited radio resources [6].
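The packetization and power-allocation ideas in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the grouping of whole channels into one packet, and the proportional power rule are all assumptions chosen for illustration. The point it shows is that when each packet carries complete feature channels, a lost packet removes whole channel maps rather than scattering errors across every channel, and that more important packets can be given a larger share of a fixed power budget.

```python
from typing import List

# A feature tensor indexed as [channel][row][col] (hypothetical layout).
Feature = List[List[List[float]]]


def packetize_by_channel(feature: Feature, channels_per_packet: int) -> List[List[float]]:
    """Group whole channels into packets so a packet loss stays confined to those channels."""
    packets = []
    for start in range(0, len(feature), channels_per_packet):
        group = feature[start:start + channels_per_packet]
        # Flatten the selected channels into one payload.
        packets.append([v for channel in group for row in channel for v in row])
    return packets


def allocate_power(importance: List[float], total_power: float) -> List[float]:
    """Split total_power across packets in proportion to importance (assumed rule)."""
    total = sum(importance)
    if total <= 0:
        # Fall back to equal power when no importance information is available.
        return [total_power / len(importance)] * len(importance)
    return [total_power * w / total for w in importance]


if __name__ == "__main__":
    # Four 2x2 channels, each filled with its own channel index.
    feature = [[[float(c)] * 2 for _ in range(2)] for c in range(4)]
    packets = packetize_by_channel(feature, channels_per_packet=2)
    powers = allocate_power([1.0, 3.0], total_power=4.0)
    print(len(packets), powers)
```

A proportional split is only one possible rule; the scheme described in the abstract may optimize power differently.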
User-Intent-Driven Semantic Communication via Adaptive Deep Understanding
Ye, Peigen, Duan, Jingpu, Du, Hongyang, Guo, Yulan
Semantic communication focuses on transmitting task-relevant semantic information, aiming for intent-oriented communication. While existing systems improve efficiency by extracting key semantics, they still fail to deeply understand and generalize users' real intentions. To overcome this, we propose a user-intention-driven semantic communication system that interprets diverse abstract intents. First, we integrate a multi-modal large model as a semantic knowledge base to generate a user-intention prior. Next, a mask-guided attention module is proposed to effectively highlight critical semantic regions. Further, a channel state awareness module ensures adaptive, robust transmission across varying channel conditions. Extensive experiments demonstrate that our system achieves deep intent understanding and outperforms DeepJSCC, e.g., under a Rayleigh channel at an SNR of 5 dB, it achieves improvements of 8%, 6%, and 19% in PSNR, SSIM, and LPIPS, respectively.
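For readers unfamiliar with the evaluation setting mentioned above (a Rayleigh channel at a given SNR), the following is a minimal sketch of how such a channel is commonly simulated. It is an illustration under stated assumptions, not the paper's code: the function names are hypothetical, the fading gain is held constant over the block, the SNR is defined relative to the average transmit symbol power, and the receiver is assumed to know the gain perfectly.

```python
import math
import random


def rayleigh_block_channel(symbols, snr_db, rng):
    """Pass complex symbols through one Rayleigh-faded block with additive Gaussian noise."""
    # Single complex gain h ~ CN(0, 1) for the whole block (block-fading assumption).
    h = complex(rng.gauss(0.0, math.sqrt(0.5)), rng.gauss(0.0, math.sqrt(0.5)))
    signal_power = sum(abs(s) ** 2 for s in symbols) / len(symbols)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power / 2)  # per real dimension
    received = [
        h * s + complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
        for s in symbols
    ]
    return received, h


def equalize(received, h):
    """Zero-forcing equalization, assuming the receiver knows h exactly."""
    return [y / h for y in received]


if __name__ == "__main__":
    rng = random.Random(0)
    tx = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
    rx, h = rayleigh_block_channel(tx, snr_db=5.0, rng=rng)
    print([round(abs(a - b), 3) for a, b in zip(equalize(rx, h), tx)])
```

A channel-state-aware encoder, as described in the abstract, would additionally take the SNR or the gain as an input; that part is not sketched here.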
A Channel-Triggered Backdoor Attack on Wireless Semantic Image Reconstruction
Wan, Jialin, Cheng, Nan, Shen, Jinglong
Despite the transformative impact of deep learning (DL) on wireless communication systems through data-driven end-to-end (E2E) learning, the security vulnerabilities of these systems have been largely overlooked. Unlike the extensively studied image domain, limited research has explored the threat of backdoor attacks on the reconstruction of symbols in semantic communication (SemCom) systems. Previous work has investigated such backdoor attacks at the input level, but these approaches are infeasible in applications with strict input control. In this paper, we propose a novel attack paradigm, termed Channel-Triggered Backdoor Attack (CT-BA), where the backdoor trigger is a specific wireless channel. This attack leverages fundamental physical layer characteristics, making it more covert and potentially more threatening compared to previous input-level attacks. Specifically, we utilize channel gain with different fading distributions or channel noise with different power spectral densities as potential triggers. This approach establishes unprecedented attack flexibility as the adversary can select backdoor triggers from both fading characteristics and noise variations in diverse channel environments. Moreover, during the testing phase, CT-BA enables automatic trigger activation through natural channel variations without requiring active adversary participation. We evaluate the robustness of CT-BA on a ViT-based Joint Source-Channel Coding (JSCC) model across three datasets: MNIST, CIFAR-10, and ImageNet. Furthermore, we apply CT-BA to three typical E2E SemCom systems: BDJSCC, ADJSCC, and JSCCOFDM. Experimental results demonstrate that our attack achieves near-perfect attack success rate (ASR) while maintaining effective stealth. Finally, we discuss potential defense mechanisms against such attacks.
Sequence Spreading-Based Semantic Communication Under High RF Interference
Barka, Hazem, Kaddoum, Georges, Bennis, Mehdi, Alam, Md Sahabul, Au, Minh
In the evolving landscape of wireless communications, semantic communication (SemCom) has recently emerged as a 6G enabler that prioritizes the transmission of meaning and contextual relevance over conventional bit-centric metrics. However, the deployment of SemCom systems in industrial settings presents considerable challenges, such as high radio frequency interference (RFI), that can adversely affect system performance. To address this problem, in this work, we propose a novel approach based on integrating sequence spreading techniques with SemCom to enhance system robustness against such adverse conditions and enable scalable multi-user (MU) SemCom. In addition, we propose a novel signal refining network (SRN) to refine the received signal after despreading and equalization. The proposed network eliminates the need for computationally intensive end-to-end (E2E) training while improving performance metrics, achieving a 25% gain in BLEU score and a 12% increase in semantic similarity compared to E2E training using the same bandwidth.
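To make the sequence-spreading idea above concrete, here is a minimal sketch of direct-sequence spreading with orthogonal Walsh codes. It is an illustration only: the abstract does not say which spreading sequences are used, so the Walsh (Sylvester-Hadamard) construction, the function names, and the constant-offset interference model are assumptions. It shows how two users can share the same chips and still be separated, and how interference that is uncorrelated with a user's code is suppressed by despreading.

```python
def walsh_codes(order):
    """Return 2**order mutually orthogonal +/-1 codes (rows of a Sylvester-Hadamard matrix)."""
    codes = [[1]]
    for _ in range(order):
        codes = [row + row for row in codes] + [row + [-c for c in row] for row in codes]
    return codes


def spread(symbols, code):
    """Replace each symbol by the symbol multiplied with every chip of the code."""
    return [s * c for s in symbols for c in code]


def despread(chips, code):
    """Correlate each code-length chunk of chips with the code and normalize."""
    n = len(code)
    return [
        sum(chips[i + k] * code[k] for k in range(n)) / n
        for i in range(0, len(chips), n)
    ]


if __name__ == "__main__":
    codes = walsh_codes(2)  # four codes of length 4
    user_a, user_b = [1, -1, 1], [-1, -1, 1]
    # Both users transmit at once; a constant offset models crude interference.
    channel = [
        a + b + 0.5
        for a, b in zip(spread(user_a, codes[1]), spread(user_b, codes[2]))
    ]
    print(despread(channel, codes[1]), despread(channel, codes[2]))
```

In the system described above, the despread signal would then be passed to the signal refining network; that learned stage is outside the scope of this sketch.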
Generative Semantic Communication: Architectures, Technologies, and Applications
Ren, Jinke, Sun, Yaping, Du, Hongyang, Yuan, Weiwen, Wang, Chongjie, Wang, Xianda, Zhou, Yingbin, Zhu, Ziwei, Wang, Fangxin, Cui, Shuguang
This paper delves into the applications of generative artificial intelligence (GAI) in semantic communication (SemCom) and presents a thorough study. Three popular SemCom systems enabled by classical GAI models are first introduced, including variational autoencoders, generative adversarial networks, and diffusion models. For each system, the fundamental concept of the GAI model, the corresponding SemCom architecture, and the associated literature review of recent efforts are elucidated. Then, a novel generative SemCom system is proposed by incorporating a cutting-edge GAI technology, large language models (LLMs). This system features two LLM-based AI agents at both the transmitter and receiver, serving as "brains" to enable powerful information understanding and content regeneration capabilities, respectively. This innovative design allows the receiver to directly generate the desired content, instead of recovering the bit stream, based on the coded semantic information conveyed by the transmitter. Therefore, it shifts the communication mindset from "information recovery" to "information regeneration" and thus ushers in a new era of generative SemCom. A case study on point-to-point video retrieval is presented to demonstrate the superiority of the proposed generative SemCom system, showcasing a 99.98% reduction in communication overhead and a 53% improvement in retrieval accuracy compared to the traditional communication system. Furthermore, four typical application scenarios for generative SemCom are delineated, followed by a discussion of three open issues warranting future investigation. In a nutshell, this paper provides a holistic set of guidelines for applying GAI in SemCom, paving the way for the efficient implementation of generative SemCom in future wireless networks.
Large Generative Model-assisted Talking-face Semantic Communication System
Jiang, Feibo, Tu, Siwei, Dong, Li, Pan, Cunhua, Wang, Jiangzhou, You, Xiaohu
The rapid development of generative Artificial Intelligence (AI) continually unveils the potential of Semantic Communication (SemCom). However, current talking-face SemCom systems still encounter challenges such as low bandwidth utilization, semantic ambiguity, and diminished Quality of Experience (QoE). This study introduces a Large Generative Model-assisted Talking-face Semantic Communication (LGM-TSC) System tailored for talking-face video communication. Firstly, we introduce a Generative Semantic Extractor (GSE) at the transmitter based on the FunASR model to convert semantically sparse talking-face videos into texts with high information density. Secondly, we establish a private Knowledge Base (KB) based on the Large Language Model (LLM) for semantic disambiguation and correction, complemented by a joint knowledge base-semantic-channel coding scheme. Finally, at the receiver, we propose a Generative Semantic Reconstructor (GSR) that utilizes BERT-VITS2 and SadTalker models to transform text back into a high-QoE talking-face video matching the user's timbre. Simulation results demonstrate the feasibility and effectiveness of the proposed LGM-TSC system.
Building the Self-Improvement Loop: Error Detection and Correction in Goal-Oriented Semantic Communications
Li, Peizheng, Lin, Xinyi, Aijaz, Adnan
Error detection and correction are essential for ensuring robust and reliable operation in modern communication systems, particularly in complex transmission environments. However, discussions on these topics have largely been overlooked in semantic communication (SemCom), which focuses on transmitting meaning rather than symbols, leading to significant improvements in communication efficiency. Despite these advantages, semantic errors -- stemming from discrepancies between transmitted and received meanings -- present a major challenge to system reliability. This paper addresses this gap by proposing a comprehensive framework for detecting and correcting semantic errors in SemCom systems. We formally define semantic error, detection, and correction mechanisms, and identify key sources of semantic errors. To address these challenges, we develop a Gaussian process (GP)-based method for latent space monitoring to detect errors, alongside a human-in-the-loop reinforcement learning (HITL-RL) approach to optimize semantic model configurations using user feedback. Experimental results validate the effectiveness of the proposed methods in mitigating semantic errors under various conditions, including adversarial attacks, input feature changes, physical channel variations, and user preference shifts. This work lays the foundation for more reliable and adaptive SemCom systems with robust semantic error management techniques.
Toward Mixture-of-Experts Enabled Trustworthy Semantic Communication for 6G Networks
He, Jiayi, Luo, Xiaofeng, Kang, Jiawen, Du, Hongyang, Xiong, Zehui, Chen, Ci, Niyato, Dusit, Shen, Xuemin
Semantic Communication (SemCom) plays a pivotal role in 6G networks, offering a viable solution for future efficient communication. Deep Learning (DL)-based semantic codecs further enhance this efficiency. However, the vulnerability of DL models to security threats, such as adversarial attacks, poses significant challenges for practical applications of SemCom systems. These vulnerabilities enable attackers to tamper with messages and eavesdrop on private information, especially in wireless communication scenarios. Although existing defenses attempt to address specific threats, they often fail to simultaneously handle multiple heterogeneous attacks. To overcome this limitation, we introduce a novel Mixture-of-Experts (MoE)-based SemCom system. This system comprises a gating network and multiple experts, each specializing in different security challenges. The gating network adaptively selects suitable experts to counter heterogeneous attacks based on user-defined security requirements. Multiple experts collaborate to accomplish semantic communication tasks while meeting the security requirements of users. A case study in vehicular networks demonstrates the efficacy of the MoE-based SemCom system. Simulation results show that the proposed MoE-based SemCom system effectively mitigates concurrent heterogeneous attacks, with minimal impact on downstream task accuracy.
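The gating-plus-experts structure described above can be sketched in a few lines. This is a generic top-k Mixture-of-Experts forward pass, not the authors' system: the function names, the use of softmax scores, and the toy experts are assumptions, and in the paper the gate is driven by user-defined security requirements rather than by the hand-set scores used here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of gate scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def select_experts(gate_scores, top_k):
    """Keep the top_k experts by gate probability and renormalize their weights."""
    probs = softmax(gate_scores)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in ranked)
    return {i: probs[i] / norm for i in ranked}


def moe_forward(x, experts, gate_scores, top_k=2):
    """Combine the outputs of the selected experts, weighted by the gate."""
    weights = select_experts(gate_scores, top_k)
    out = [0.0] * len(x)
    for i, w in weights.items():
        y = experts[i](x)
        out = [o + w * v for o, v in zip(out, y)]
    return out


if __name__ == "__main__":
    # Toy experts standing in for security-specialized codecs (hypothetical).
    experts = [
        lambda x: list(x),
        lambda x: [2 * v for v in x],
        lambda x: [0.0 for _ in x],
    ]
    print(moe_forward([1.0, 2.0], experts, gate_scores=[0.0, 0.0, -100.0], top_k=2))
```

Running only the selected experts is what keeps the cost manageable as more specialized defenses are added to the pool.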
Latent Diffusion Model-Enabled Real-Time Semantic Communication Considering Semantic Ambiguities and Channel Noises
Pei, Jianhua, Feng, Cheng, Wang, Ping, Tabassum, Hina, Shi, Dongyuan
Semantic communication (SemCom) has emerged as a new paradigm for 6G communication, with deep learning (DL) models being one of the key drivers of the shift from the accuracy of bits/symbols to the semantics and pragmatics of data. Nevertheless, DL-based SemCom systems often face performance bottlenecks due to overfitting, poor generalization, and sensitivity to outliers. Furthermore, the varying fading gains and noises with uncertain signal-to-noise ratios (SNRs) commonly present in wireless channels usually restrict the accuracy of semantic information transmission. Consequently, this paper constructs a latent diffusion model-enabled SemCom system, and proposes three improvements compared to existing works: i) To handle potential outliers in the source data, semantic errors obtained by projected gradient descent based on the vulnerabilities of DL models are utilized to update the parameters and obtain an outlier-robust encoder. ii) A lightweight single-layer latent space transformation adapter completes one-shot learning at the transmitter and is placed before the decoder at the receiver, enabling adaptation for out-of-distribution data and enhancing human-perceptual quality. iii) An end-to-end consistency distillation (EECD) strategy is used to distill the diffusion models trained in latent space, enabling deterministic single or few-step real-time denoising in various noisy channels while maintaining high semantic quality. Extensive numerical experiments across different datasets demonstrate the superiority of the proposed SemCom system, consistently proving its robustness to outliers, the capability to transmit data with unknown distributions, and the ability to perform real-time channel denoising tasks while preserving high human perceptual quality, outperforming the existing denoising approaches in semantic metrics.
Interplay of Semantic Communication and Knowledge Learning
Ni, Fei, Wang, Bingyan, Li, Rongpeng, Zhao, Zhifeng, Zhang, Honggang
In the swiftly advancing realm of communication technologies, Semantic Communication (SemCom), which emphasizes knowledge understanding and processing, has emerged as a hot topic. By integrating artificial intelligence technologies, SemCom facilitates a profound understanding, analysis and transmission of communication content. In this chapter, we clarify the means of knowledge learning in SemCom with a particular focus on the utilization of Knowledge Graphs (KGs). Specifically, we first review existing efforts that combine SemCom with knowledge learning. Subsequently, we introduce a KG-enhanced SemCom system, wherein the receiver is carefully calibrated to leverage knowledge from its static knowledge base to improve decoding performance. Building on this framework, we further explore potential approaches that can empower the system to operate more effectively with an evolving knowledge base. Furthermore, we investigate the possibility of integration with Large Language Models (LLMs) for data augmentation, offering an additional perspective on the potential implementation means of SemCom. Extensive numerical results demonstrate that the proposed framework yields superior performance on top of the KG-enhanced decoding and manifests its versatility under different scenarios.